Speeding up ResNet training
Author
Abstract
The time required to train a model is an important factor limiting the pace of progress in deep learning. The faster a model trains, the more options researchers can try in the same amount of time, and the higher the quality of their results. In this work we stacked a set of techniques to reduce the training time of a 20-layer ResNet and achieved a substantial speedup over the baseline. Our best stacked model trains about 5 times faster than the baseline model.
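The abstract does not enumerate the stacked techniques, so the following is only a hedged illustration: two optimizations commonly stacked for fast ResNet training, mixed-precision arithmetic and a one-cycle learning-rate schedule, sketched in PyTorch. The model, data loader, and all hyperparameters are placeholder assumptions, not the paper's recipe.

```python
# Hypothetical sketch: two techniques often stacked to shorten ResNet
# training (mixed precision + one-cycle LR). Not the paper's actual
# recipe; model, loader, and hyperparameters are placeholders.
import torch
from torch import nn

def train_fast(model, train_loader, epochs=30, max_lr=0.4):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=max_lr,
                          momentum=0.9, weight_decay=5e-4)
    sched = torch.optim.lr_scheduler.OneCycleLR(
        opt, max_lr=max_lr, epochs=epochs,
        steps_per_epoch=len(train_loader))
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad(set_to_none=True)
            with torch.autocast(device_type=device):
                loss = loss_fn(model(x), y)   # forward in reduced precision
            scaler.scale(loss).backward()     # scaled backward avoids fp16 underflow
            scaler.step(opt)
            scaler.update()
            sched.step()                      # one-cycle: per-step LR update
```

The appeal of this kind of stacking is that the techniques are largely independent: mixed precision reduces the cost of each step, while the one-cycle schedule reduces the number of steps needed to converge.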
Similar resources
Learning Deep Resnet Blocks Sequentially
We prove a multiclass boosting theory for the ResNet architectures which simultaneously creates a new technique for multiclass boosting and provides a new algorithm for ResNet-style architectures. Our proposed training algorithm, BoostResNet, is particularly suitable for non-differentiable architectures. Our method only requires the relatively inexpensive sequential training of T “shallow ResNet...
Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train
For the past 5 years, the ILSVRC competition and the ImageNet dataset have attracted a lot of interest from the Computer Vision community, allowing for state-of-the-art accuracy to grow tremendously. This should be credited to the use of deep artificial neural network designs. As these became more complex, the storage, bandwidth, and compute requirements increased. This means that with a non-di...
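The snippet above is truncated; as general background (not necessarily this paper's exact scheme), accuracy at large minibatch sizes is commonly preserved with a linear learning-rate scaling rule plus a gradual warmup period. A minimal sketch, with all constants as assumptions:

```python
# Illustrative only: the linear-scaling-plus-warmup heuristic commonly
# used for large-minibatch SGD. Constants are assumptions, not values
# from the cited paper.
def scaled_lr(step, *, base_lr=0.1, base_batch=256,
              batch=8192, warmup_steps=500):
    """Return the learning rate for a given optimizer step."""
    target = base_lr * batch / base_batch    # linear scaling rule
    if step < warmup_steps:                  # gradual warmup avoids early divergence
        return target * (step + 1) / warmup_steps
    return target
```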
Learning Deep ResNet Blocks Sequentially using Boosting Theory
Deep neural networks are known to be difficult to train due to the instability of back-propagation. A deep residual network (ResNet) with identity loops remedies this by stabilizing gradient computations. We prove a boosting theory for the ResNet architecture. We construct T weak module classifiers, each of which contains two of the T layers, such that the combined strong learner is a ResNet. Therefore,...
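A loose sketch of the sequential idea, under the assumption that each stage fits one residual block with all earlier blocks frozen, using a temporary linear head as that stage's classifier. This is a simplification for illustration, not BoostResNet's actual boosting-derived algorithm; `feat_dim` and the auxiliary head are assumptions.

```python
# Simplified sketch of training ResNet blocks one at a time, in the
# spirit of sequential/blockwise training. Not BoostResNet's exact
# algorithm; the auxiliary linear head and feat_dim are assumptions.
import torch
from torch import nn

def train_blocks_sequentially(blocks, feat_dim, n_classes,
                              train_loader, epochs_per_block=5):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    trained = nn.Sequential().to(device)     # frozen prefix of the network
    loss_fn = nn.CrossEntropyLoss()
    for block in blocks:
        block = block.to(device)
        head = nn.Linear(feat_dim, n_classes).to(device)  # temporary classifier
        opt = torch.optim.SGD(list(block.parameters()) + list(head.parameters()),
                              lr=0.1, momentum=0.9)
        for _ in range(epochs_per_block):
            for x, y in train_loader:
                x, y = x.to(device), y.to(device)
                with torch.no_grad():        # earlier blocks stay frozen
                    h = trained(x)
                logits = head(block(h).mean(dim=(2, 3)))  # global average pool
                opt.zero_grad(set_to_none=True)
                loss_fn(logits, y).backward()
                opt.step()
        trained.append(block)                # extend the frozen prefix
    return trained
```

Each stage optimizes only a shallow sub-problem, which is where the claimed training stability comes from.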
Supplementary Material for “DualNet: Learn Complementary Features for Image Recognition”
Besides ResNet-20, we further evaluate DualNet based on the deeper ResNet [6], e.g., with 32 layers and 56 layers (denoted as ResNet-32 & ResNet-56, referring to the third-party implementation available at [2]). ResNet-32 & ResNet-56, as well as the corresponding DualNets (denoted as DNR32 & DNR56), are also trained on the augmented CIFAR-100, and the experimental results are shown in Table 1. The perfo...
Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks
Progress in deep learning is slowed by the days or weeks it takes to train large models. The natural solution of using more hardware is limited by diminishing returns, and leads to inefficient use of additional resources. In this paper, we present a large batch, stochastic optimization algorithm that is both faster than widely used algorithms for fixed amounts of computation, and also scales up...
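The abstract does not spell out the algorithm, but the name refers to the Neumann series, the identity (I - A)^{-1} = Σ_{k≥0} A^k for matrices with spectral radius below 1, which lets an inverse be approximated by repeated multiplication. A small numerical check of that identity only, not of the optimizer itself:

```python
# Demonstrates the Neumann-series identity the optimizer is named after:
# (I - A)^{-1} = sum_{k>=0} A^k when the spectral radius of A is < 1.
# This checks the math only; it is not the optimizer.
import numpy as np

rng = np.random.default_rng(0)
A = 0.1 * rng.standard_normal((4, 4))     # entries scaled to keep ||A|| small
exact = np.linalg.inv(np.eye(4) - A)

approx, term = np.zeros((4, 4)), np.eye(4)
for _ in range(30):                       # truncated series: I + A + A^2 + ...
    approx += term
    term = term @ A

print(np.max(np.abs(exact - approx)))     # close to 0 for enough terms
```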